public inbox for u-boot@lists.denx.de
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
@ 2010-12-15 15:13 Rob Herring
  2010-12-17 20:21 ` Wolfgang Denk
  2010-12-17 21:27 ` Måns Rullgård
  0 siblings, 2 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-15 15:13 UTC
  To: u-boot

From: Rob Herring <rob.herring@calxeda.com>

swab functions are heavily used by FDT code, so enable
optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
 arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 #  define __SWAB_64_THRU_32__
 #endif
 
+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab16 ___arch_swab16
+
+static inline __u32 __attribute__((const)) ___arch_swab32(__u32 x)
+{
+	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab32 ___arch_swab32
+#endif
+
 #ifdef __ARMEB__
 #include <linux/byteorder/big_endian.h>
 #else
-- 
1.7.1

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-15 15:13 [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions Rob Herring
@ 2010-12-17 20:21 ` Wolfgang Denk
  2010-12-17 20:52   ` Rob Herring
  2010-12-17 21:27 ` Måns Rullgård
  1 sibling, 1 reply; 7+ messages in thread
From: Wolfgang Denk @ 2010-12-17 20:21 UTC
  To: u-boot

Dear Rob Herring,

In message <1292425994-24331-1-git-send-email-robherring2@gmail.com> you wrote:
> From: Rob Herring <rob.herring@calxeda.com>
> 
> swab functions are heavily used by FDT code, so enable
> optimized assembly code for ARMv6 and later.
> 
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> ---
>  arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
>  1 files changed, 16 insertions(+), 0 deletions(-)

Do you have any numbers showing whether this change gives any
measurable improvement?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
A modem is a baudy house.

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-17 20:21 ` Wolfgang Denk
@ 2010-12-17 20:52   ` Rob Herring
  0 siblings, 0 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-17 20:52 UTC
  To: u-boot

Wolfgang,

On 12/17/2010 02:21 PM, Wolfgang Denk wrote:
> Dear Rob Herring,
>
> In message<1292425994-24331-1-git-send-email-robherring2@gmail.com>  you wrote:
>> From: Rob Herring<rob.herring@calxeda.com>
>>
>> swab functions are heavily used by FDT code, so enable
>> optimized assembly code for ARMv6 and later.
>>
>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>> ---
>>   arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
>>   1 files changed, 16 insertions(+), 0 deletions(-)
>
> Do you have any numbers showing whether this change gives any
> measurable improvement?

I have an instruction trace capture and see repeated calls to swab32 by
the fdt code. It's obvious low-hanging fruit. Boot time with a device
tree is noticeably longer than without one, but I don't have any
formal measurements.
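
For context, a minimal sketch of why the fdt code calls swab32 so often:
a flattened device tree stores every 32-bit cell big-endian, so a
little-endian ARM core has to swap each cell it reads. The helper names
below (swab32_generic, be32_load) are illustrative and not taken from
U-Boot or libfdt, and the byte-order macros assume GCC or Clang. Whether
the compiler already reduces the shift-and-mask pattern to a single rev
depends on the compiler version and flags.

```c
#include <stdint.h>
#include <string.h>

/* Generic byte swap: four shift-and-mask steps on a 32-bit value. */
static inline uint32_t swab32_generic(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) <<  8) |
	       ((x & 0x00ff0000u) >>  8) |
	       ((x & 0xff000000u) >> 24);
}

/* Hypothetical helper: load one big-endian 32-bit cell from a blob. */
static inline uint32_t be32_load(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));	/* copy avoids an unaligned load */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	v = swab32_generic(v);		/* little-endian host: swap needed */
#endif
	return v;
}
```

The first cell of a device tree blob holds the magic value 0xd00dfeed, so
be32_load() on the first four bytes of a valid blob should return that
value on either byte order.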

Rob

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-15 15:13 [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions Rob Herring
  2010-12-17 20:21 ` Wolfgang Denk
@ 2010-12-17 21:27 ` Måns Rullgård
  2010-12-18 17:12   ` Rob Herring
  1 sibling, 1 reply; 7+ messages in thread
From: Måns Rullgård @ 2010-12-17 21:27 UTC
  To: u-boot

Rob Herring <robherring2@gmail.com> writes:

> From: Rob Herring <rob.herring@calxeda.com>
>
> swab functions are heavily used by FDT code, so enable
> optimized assembly code for ARMv6 and later.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> ---
>  arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
>  1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
> index c3489f1..9df5844 100644
> --- a/arch/arm/include/asm/byteorder.h
> +++ b/arch/arm/include/asm/byteorder.h
> @@ -23,6 +23,22 @@
>  #  define __SWAB_64_THRU_32__
>  #endif
>
> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
> +{
> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
> +	return x;
> +}

Pay close attention to what gcc does with this as it is prone to add
unnecessary masking of the low halfword.  If the callers are
well-behaved (argument having top halfword clear), making the
parameter and return types here plain unsigned (or u32) gives better
code.
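
A minimal sketch of the two signatures being compared, for anyone who
wants to inspect the generated code. The names swab16_narrow and
swab16_wide and the HAVE_REV16 macro are illustrative, not from the
patch, and the non-ARM branch is only a portable stand-in so the sketch
builds anywhere. With the __u16-style signature the compiler may add
truncation or zero-extension around the asm; with plain unsigned it has
no reason to, but the caller then owns the precondition that the top
halfword is clear.

```c
#include <stdint.h>

/* Use the instruction only where the patch's condition would hold. */
#if defined(__arm__) && (defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__))
#define HAVE_REV16 1
#else
#define HAVE_REV16 0
#endif

/* Narrow variant: 16-bit parameter and return type, as in the patch. */
static inline uint16_t swab16_narrow(uint16_t x)
{
#if HAVE_REV16
	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
	return x;
#else
	return (uint16_t)((x << 8) | (x >> 8));
#endif
}

/*
 * Wide variant: plain unsigned in and out. The caller must pass a value
 * whose top halfword is zero; the result then also has it zero.
 */
static inline unsigned int swab16_wide(unsigned int x)
{
#if HAVE_REV16
	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
	return x;
#else
	/* Same effect as rev16: swap the bytes within each halfword. */
	return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
#endif
}
```

Building a caller of each variant for ARM (for example with GCC and
-O2 -march=armv7-a -S) and comparing the assembly should show whether an
extra uxth or equivalent masking appears around the narrow variant.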

-- 
Måns Rullgård
mans at mansr.com

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-17 21:27 ` Måns Rullgård
@ 2010-12-18 17:12   ` Rob Herring
  2010-12-18 18:17     ` Måns Rullgård
  2010-12-18 21:59     ` Wolfgang Denk
  0 siblings, 2 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-18 17:12 UTC
  To: u-boot

On 12/17/2010 03:27 PM, Måns Rullgård wrote:
> Rob Herring<robherring2@gmail.com>  writes:
>
>> From: Rob Herring<rob.herring@calxeda.com>
>>
>> swab functions are heavily used by FDT code, so enable
>> optimized assembly code for ARMv6 and later.
>>
>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>> ---
>>   arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
>>   1 files changed, 16 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
>> index c3489f1..9df5844 100644
>> --- a/arch/arm/include/asm/byteorder.h
>> +++ b/arch/arm/include/asm/byteorder.h
>> @@ -23,6 +23,22 @@
>>   #  define __SWAB_64_THRU_32__
>>   #endif
>>
>> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
>> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
>> +{
>> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
>> +	return x;
>> +}
>
> Pay close attention to what gcc does with this as it is prone to add
> unnecessary masking of the low halfword.  If the callers are
> well-behaved (argument having top halfword clear), making the
> parameter and return types here plain unsigned (or u32) gives better
> code.

This is straight from the Linux code, and there are only a few users of
swab16 (none in my build).

Rob

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-18 17:12   ` Rob Herring
@ 2010-12-18 18:17     ` Måns Rullgård
  2010-12-18 21:59     ` Wolfgang Denk
  1 sibling, 0 replies; 7+ messages in thread
From: Måns Rullgård @ 2010-12-18 18:17 UTC
  To: u-boot

Rob Herring <robherring2@gmail.com> writes:

> On 12/17/2010 03:27 PM, Måns Rullgård wrote:
>> Rob Herring<robherring2@gmail.com>  writes:
>>
>>> From: Rob Herring<rob.herring@calxeda.com>
>>>
>>> swab functions are heavily used by FDT code, so enable
>>> optimized assembly code for ARMv6 and later.
>>>
>>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>>> ---
>>>   arch/arm/include/asm/byteorder.h |   16 ++++++++++++++++
>>>   1 files changed, 16 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
>>> index c3489f1..9df5844 100644
>>> --- a/arch/arm/include/asm/byteorder.h
>>> +++ b/arch/arm/include/asm/byteorder.h
>>> @@ -23,6 +23,22 @@
>>>   #  define __SWAB_64_THRU_32__
>>>   #endif
>>>
>>> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
>>> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
>>> +{
>>> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
>>> +	return x;
>>> +}
>>
>> Pay close attention to what gcc does with this as it is prone to add
>> unnecessary masking of the low halfword.  If the callers are
>> well-behaved (argument having top halfword clear), making the
>> parameter and return types here plain unsigned (or u32) gives better
>> code.
>
> This is straight from the Linux code, and there are only a few users of
> swab16 (none in my build).

Look at the generated code if you don't believe me.

-- 
Måns Rullgård
mans at mansr.com

* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
  2010-12-18 17:12   ` Rob Herring
  2010-12-18 18:17     ` Måns Rullgård
@ 2010-12-18 21:59     ` Wolfgang Denk
  1 sibling, 0 replies; 7+ messages in thread
From: Wolfgang Denk @ 2010-12-18 21:59 UTC
  To: u-boot

Dear Rob Herring,

In message <4D0CEB67.2040502@gmail.com> you wrote:
>
> This is straight from the Linux code, and there are only a few users of
> swab16 (none in my build).

Given that we have no idea whether this code really gives any
measurable performance improvement, and that it appears to be dangerous
as well, I am inclined not to include it as is.

Thanks.


Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
It is surely a great calamity for a human being to have no obsessions.
                                                         - Robert Bly
