From mboxrd@z Thu Jan 1 00:00:00 1970 From: Chris Moore Date: Fri, 09 Oct 2009 06:42:01 +0200 Subject: [U-Boot] [PATCH V2 1/3] memcpy: copy one word at a time if possible In-Reply-To: <20091008204431.07341E8B31D@gemini.denx.de> References: <45d5e3a574bf4844f46f50b2c88054a5b28f973b.1255000877.git.rubini@ unipv.it> <20091008204431.07341E8B31D@gemini.denx.de> Message-ID: <4ACEBF19.4010902@free.fr> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Wolfgang Denk a ?crit : > I think we should change this if-else into a plain if, something like > that: > > void * memcpy(void *dest, const void *src, size_t count) > { > char *tmp = (char *) dest, *s = (char *) src; > char *d8 = (char *)dest, *s8 = (char *)src; > unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src; > > /* while all data is aligned (common case), copy a word at a time */ > if ( (((int)dest | (int)src | count) & (sizeof(long) - 1)) == 0) { > while (count) { > *dl++ = *sl++; > count -= sizeof(unsigned long); > } > } > while (count--) > *d8++ = *s8++; > > return dest; > } > > This way we can have both - the "long" copy of a potential aligne > dfirst part, and the byte copy of any trailing (or unaligned) part. > > I agree wholeheartedly with the idea but shouldn't it be more like this (untested) code : void * memcpy(void *dest, const void *src, size_t count) { char *d8, *s8; unsigned long *dl = dest, *sl = src; /* while all data is aligned (common case), copy multiple bytes at a time */ if ( (((int)(long)dest | (int)(long)src) & (sizeof(*dl) - 1)) == 0) { while (count >= sizeof(*dl)) { *dl++ = *sl++; count -= sizeof(*dl); } } d8 = (char *)dl; s8 = (char *)sl; /* copy any remaining data byte by byte */ while (count--) *d8++ = *s8++; return dest; } Remarks : 1) My curious (int) (long) pointer casts are intended to avoid compiler warnings while avoiding unnecessary calculations in long. On some architectures long calculations are less efficient than int ones. In fact I wonder whether, on such architectures, it might not also be better to perform the copy with int size chunks. 2) Personally I prefer sizeof(*dl) to sizeof(unsigned long) as there is less risk of error if the type of the chunks is changed. 3) In C (but not in C++) I think the casts from void * to unsigned long * are unnecessary. But as I said all this is completely untested :( Cheers, Chris