public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed
* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
@ 2009-10-09  9:12 Alessandro Rubini
  2009-10-09 10:21 ` Mike Frysinger
  2009-10-10  6:13 ` Chris Moore
  0 siblings, 2 replies; 6+ messages in thread
From: Alessandro Rubini @ 2009-10-09  9:12 UTC (permalink / raw)
  To: u-boot

From: Alessandro Rubini <rubini@unipv.it>

If source and destination are aligned, this copies ulong values
until possible, trailing part is copied by byte. Thanks for the details
to Wolfgang Denk, Mike Frysinger, Peter Tyser, Chris Moore.

Signed-off-by: Alessandro Rubini <rubini@unipv.it>
Acked-by: Andrea Gallo <andrea.gallo@stericsson.com>
---
 lib_generic/string.c |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib_generic/string.c b/lib_generic/string.c
index 181eda6..9bee79b 100644
--- a/lib_generic/string.c
+++ b/lib_generic/string.c
@@ -446,12 +446,23 @@ char * bcopy(const char * src, char * dest, int count)
  * You should not use this function to access IO space, use memcpy_toio()
  * or memcpy_fromio() instead.
  */
-void * memcpy(void * dest,const void *src,size_t count)
+void * memcpy(void *dest, const void *src, size_t count)
 {
-	char *tmp = (char *) dest, *s = (char *) src;
-
+	unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src;
+	char *d8, *s8;
+
+	/* while all data is aligned (common case), copy a word at a time */
+	if ( (((ulong)dest | (ulong)src | count) & (sizeof(*dl) - 1)) == 0) {
+		while (count >= sizeof(*dl)) {
+			*dl++ = *sl++;
+			count -= sizeof(*dl);
+		}
+	}
+	/* copy the reset one byte at a time */
+	d8 = (char *)dl;
+	s8 = (char *)sl;
 	while (count--)
-		*tmp++ = *s++;
+		*d8++ = *s8++;
 
 	return dest;
 }
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
  2009-10-09  9:12 [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible Alessandro Rubini
@ 2009-10-09 10:21 ` Mike Frysinger
  2009-10-09 10:50   ` Alessandro Rubini
  2009-10-10  6:13 ` Chris Moore
  1 sibling, 1 reply; 6+ messages in thread
From: Mike Frysinger @ 2009-10-09 10:21 UTC (permalink / raw)
  To: u-boot

On Friday 09 October 2009 05:12:20 Alessandro Rubini wrote:
> +	/* while all data is aligned (common case), copy a word at a time */
> +	if ( (((ulong)dest | (ulong)src | count) & (sizeof(*dl) - 1)) == 0) {

i think you want to drop the count from the list, otherwise we dont consume 
the leading groups of 4 bytes if count isnt a multiple of 4.

otherwise, it's pretty crazy how generic yet optimized the resulting C should 
be :).  acked-by-me.
-mike
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part.
Url : http://lists.denx.de/pipermail/u-boot/attachments/20091009/cef8f0d0/attachment.pgp 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
  2009-10-09 10:21 ` Mike Frysinger
@ 2009-10-09 10:50   ` Alessandro Rubini
  0 siblings, 0 replies; 6+ messages in thread
From: Alessandro Rubini @ 2009-10-09 10:50 UTC (permalink / raw)
  To: u-boot

> i think you want to drop the count from the list, otherwise we dont consume
> the leading groups of 4 bytes if count isnt a multiple of 4.

Yes, same for memset.  See Wolfgang it was not 10% more? These micro
optimizations are hairy, as you need to measure them to make sure they
work.

Ok, V4 tomorrow.
/alessandro

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
  2009-10-09  9:12 [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible Alessandro Rubini
  2009-10-09 10:21 ` Mike Frysinger
@ 2009-10-10  6:13 ` Chris Moore
  2009-10-10  7:44   ` Alessandro Rubini
  1 sibling, 1 reply; 6+ messages in thread
From: Chris Moore @ 2009-10-10  6:13 UTC (permalink / raw)
  To: u-boot

Alessandro Rubini a ?crit :
> From: Alessandro Rubini <rubini@unipv.it>
>
> If source and destination are aligned, this copies ulong values
> until possible, trailing part is copied by byte. Thanks for the details
> to Wolfgang Denk, Mike Frysinger, Peter Tyser, Chris Moore.
>
> Signed-off-by: Alessandro Rubini <rubini@unipv.it>
> Acked-by: Andrea Gallo <andrea.gallo@stericsson.com>
> ---
>  lib_generic/string.c |   19 +++++++++++++++----
>  1 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/lib_generic/string.c b/lib_generic/string.c
> index 181eda6..9bee79b 100644
> --- a/lib_generic/string.c
> +++ b/lib_generic/string.c
> @@ -446,12 +446,23 @@ char * bcopy(const char * src, char * dest, int count)
>   * You should not use this function to access IO space, use memcpy_toio()
>   * or memcpy_fromio() instead.
>   */
> -void * memcpy(void * dest,const void *src,size_t count)
> +void * memcpy(void *dest, const void *src, size_t count)
>  {
> -	char *tmp = (char *) dest, *s = (char *) src;
> -
> +	unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src;
>   

Nitpick: Are you sure the casts are necessary here ?
I am not sure about U-Boot but Linux kernel maintainers frown on 
unnecessary casts.

> +	char *d8, *s8;
> +
> +	/* while all data is aligned (common case), copy a word at a time */
> +	if ( (((ulong)dest | (ulong)src | count) & (sizeof(*dl) - 1)) == 0) {
>   

+	if ( (((ulong)dest | (ulong)src) & (sizeof(*dl) - 1)) == 0) {

The "or" should not include count: the remaining count % sizeof(unsigned 
long) bytes are copied below.

> +		while (count >= sizeof(*dl)) {
> +			*dl++ = *sl++;
> +			count -= sizeof(*dl);
> +		}
> +	}
> +	/* copy the reset one byte at a time */
>   

Nitpick: s/reset/rest/

> +	d8 = (char *)dl;
> +	s8 = (char *)sl;
>  	while (count--)
> -		*tmp++ = *s++;
> +		*d8++ = *s8++;
>  
>  	return dest;
>  }
>   

Cheers,
Chris

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
  2009-10-10  6:13 ` Chris Moore
@ 2009-10-10  7:44   ` Alessandro Rubini
  2009-10-12  4:37     ` Chris Moore
  0 siblings, 1 reply; 6+ messages in thread
From: Alessandro Rubini @ 2009-10-10  7:44 UTC (permalink / raw)
  To: u-boot

Hello Chris

>> +	unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src;

> Nitpick: Are you sure the casts are necessary here ?

Without the one on src it complains because of "const". So I write
both for symetry.

> +	if ( (((ulong)dest | (ulong)src) & (sizeof(*dl) - 1)) == 0) {
> 
> The "or" should not include count: the remaining count % sizeof(unsigned 
> long) bytes are copied below.

Yes, that's why I'm sending V4 today. Actually, I booted V3 but didn't
measure it, so this bug went unnoticed. But I won't measure it today,
either...

Ok for spaces around operators (even if the whole of string.c is
strangely spaced, but that's historical).

thanks
/alessandro

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible
  2009-10-10  7:44   ` Alessandro Rubini
@ 2009-10-12  4:37     ` Chris Moore
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Moore @ 2009-10-12  4:37 UTC (permalink / raw)
  To: u-boot

Hi Alessandro,

Alessandro Rubini a ?crit :
>>> +	unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src;
>>>       
>
>   
>> Nitpick: Are you sure the casts are necessary here ?
>>     
>
> Without the one on src it complains because of "const". So I write
> both for symetry.
>   

Yes, of course, you and the compiler are absolutely right.
Silly me, I completely overlooked the const :(
To me casting away const is a cardinal sin (it rather defeats the 
object) and I find that the ease with which it can be done in C is 
frightening.
So I feel obliged to (hopefully) correct my submission:

void *memcpy(void *dest, const void *src, size_t count)
{
	char *d8;
	const char *s8;
	unsigned long *dl = dest;
	const unsigned long *sl = src;

	/* while all data is aligned (common case), copy multiple bytes at a time */
	if ( (((int)(long)dest | (int)(long)src) & (sizeof(*dl) - 1)) == 0) {
		while (count >= sizeof(*dl)) {
			*dl++ = *sl++;
			count -= sizeof(*dl);
		}
	}

	d8 = (char *)dl;
	s8 = (const char *)sl;

	/* copy any remaining data byte by byte */
	while (count--)
		*d8++ = *s8++;

	return dest;
}

But this is still not even compile tested :(
I never learn :(

Cheers,
Chris

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-10-12  4:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-09  9:12 [U-Boot] [PATCH V3 1/3] lib_generic memcpy: copy one word at a time if possible Alessandro Rubini
2009-10-09 10:21 ` Mike Frysinger
2009-10-09 10:50   ` Alessandro Rubini
2009-10-10  6:13 ` Chris Moore
2009-10-10  7:44   ` Alessandro Rubini
2009-10-12  4:37     ` Chris Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox