* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
@ 2010-12-15 15:13 Rob Herring
2010-12-17 20:21 ` Wolfgang Denk
2010-12-17 21:27 ` Måns Rullgård
0 siblings, 2 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-15 15:13 UTC
To: u-boot
From: Rob Herring <rob.herring@calxeda.com>
swab functions are heavily used by FDT code, so enable
optimized assembly code for ARMv6 and later.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 # define __SWAB_64_THRU_32__
 #endif
 
+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab16 ___arch_swab16
+
+static inline __u32 __attribute__((const)) ___arch_swab32(__u32 x)
+{
+	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab32 ___arch_swab32
+#endif
+
 #ifdef __ARMEB__
 #include <linux/byteorder/big_endian.h>
 #else
--
1.7.1
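For readers without an ARM toolchain at hand, the following is a minimal sketch of what the two helpers in the patch compute. The names `example_swab16`, `example_swab32` and `example_swab_selfcheck` are hypothetical and not part of the patch; the `#else` branch is an assumed plain-C fallback so that the block also builds on non-ARM hosts.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
/* Same inline assembly as the patch, using <stdint.h> types. */
static inline uint32_t example_swab32(uint32_t x)
{
	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
	return x;
}

static inline uint16_t example_swab16(uint16_t x)
{
	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
	return x;
}
#else
/* Assumed generic fallback: reverse the four bytes of a 32-bit word. */
static inline uint32_t example_swab32(uint32_t x)
{
	return (x << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8) | (x >> 24);
}

/* Assumed generic fallback: swap the two bytes of a 16-bit value. */
static inline uint16_t example_swab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}
#endif

/* Swapping twice must give back the original value. */
static inline void example_swab_selfcheck(void)
{
	assert(example_swab32(example_swab32(0x12345678u)) == 0x12345678u);
	assert(example_swab16(example_swab16(0x1234u)) == 0x1234u);
}
```

With either branch, `example_swab32(0x12345678)` yields `0x78563412` and `example_swab16(0x1234)` yields `0x3412`.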
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-15 15:13 [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions Rob Herring
@ 2010-12-17 20:21 ` Wolfgang Denk
2010-12-17 20:52 ` Rob Herring
2010-12-17 21:27 ` Måns Rullgård
1 sibling, 1 reply; 7+ messages in thread
From: Wolfgang Denk @ 2010-12-17 20:21 UTC
To: u-boot
Dear Rob Herring,
In message <1292425994-24331-1-git-send-email-robherring2@gmail.com> you wrote:
> From: Rob Herring <rob.herring@calxeda.com>
>
> swab functions are heavily used by FDT code, so enable
> optimized assembly code for ARMv6 and later.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> ---
> arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
Do you have any numbers showing whether this change gives any measurable
improvement?
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
A modem is a baudy house.
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-17 20:21 ` Wolfgang Denk
@ 2010-12-17 20:52 ` Rob Herring
0 siblings, 0 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-17 20:52 UTC
To: u-boot
Wolfgang,
On 12/17/2010 02:21 PM, Wolfgang Denk wrote:
> Dear Rob Herring,
>
> In message<1292425994-24331-1-git-send-email-robherring2@gmail.com> you wrote:
>> From: Rob Herring<rob.herring@calxeda.com>
>>
>> swab functions are heavily used by FDT code, so enable
>> optimized assembly code for ARMv6 and later.
>>
>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>> ---
>> arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
>> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> Do you have any numbers showing whether this change gives any measurable
> improvement?
I have an instruction trace capture and see repeated calls to swab32 by
the fdt code. It's obvious low-hanging fruit. Boot time with a device
tree is noticeably longer than without one, but I don't have any formal
measurements.
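Flattened device tree blobs store their 32-bit cells big-endian, so a little-endian ARM core needs one swap per cell read. A rough way to put a number on that cost might look like the sketch below. It is only an illustration under stated assumptions: the `sketch_*` names are hypothetical, the swap is a generic C version rather than the `rev` one from the patch, and hosted `clock()` stands in for whatever timer the bootloader would actually offer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Generic byte swap; an ARMv6+ build could substitute the "rev" version. */
static inline uint32_t sketch_swab32(uint32_t x)
{
	return (x << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8) | (x >> 24);
}

/*
 * Swap every cell `rounds` times and return the elapsed CPU time in
 * seconds. The running sum is handed back through `checksum` so the
 * compiler cannot discard the loop; adding the round counter to each
 * cell keeps the swap from being hoisted out of the outer loop.
 */
static double sketch_time_swab32(const uint32_t *cells, size_t n,
				 unsigned int rounds, uint32_t *checksum)
{
	uint32_t sum = 0;
	clock_t start, stop;

	assert(checksum != NULL);

	start = clock();
	for (unsigned int r = 0; r < rounds; r++)
		for (size_t i = 0; i < n; i++)
			sum += sketch_swab32(cells[i] + r);
	stop = clock();

	*checksum = sum;
	return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Running the same loop once with the generic swap and once with the assembly version, over a buffer about the size of a real blob, would give the kind of comparison asked for here.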
Rob
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-15 15:13 [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions Rob Herring
2010-12-17 20:21 ` Wolfgang Denk
@ 2010-12-17 21:27 ` Måns Rullgård
2010-12-18 17:12 ` Rob Herring
1 sibling, 1 reply; 7+ messages in thread
From: Måns Rullgård @ 2010-12-17 21:27 UTC
To: u-boot
Rob Herring <robherring2@gmail.com> writes:
> From: Rob Herring <rob.herring@calxeda.com>
>
> swab functions are heavily used by FDT code, so enable
> optimized assembly code for ARMv6 and later.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> ---
> arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
> index c3489f1..9df5844 100644
> --- a/arch/arm/include/asm/byteorder.h
> +++ b/arch/arm/include/asm/byteorder.h
> @@ -23,6 +23,22 @@
> # define __SWAB_64_THRU_32__
> #endif
>
> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
> +{
> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
> +	return x;
> +}
Pay close attention to what gcc does with this as it is prone to add
unnecessary masking of the low halfword. If the callers are
well-behaved (argument having top halfword clear), making the
parameter and return types here plain unsigned (or u32) gives better
code.
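The type question can be illustrated with a small portable model. This is a sketch, not the patch code: `model_rev16` is a hypothetical C stand-in for what `rev16` does to a full 32-bit register (it swaps the bytes inside each halfword), and the two wrappers differ only in their parameter and return types. Because the compiler cannot see inside an asm statement, a 16-bit return type may force it to zero-extend the result itself, whereas the 32-bit variant leaves the upper halfword to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Model of rev16 on a 32-bit register: swap bytes within each halfword. */
static inline uint32_t model_rev16(uint32_t r)
{
	return ((r & 0x00ff00ffu) << 8) | ((r & 0xff00ff00u) >> 8);
}

/*
 * 16-bit types, as in the patch: the result is truncated to the low
 * halfword, which is the masking step the compiler may have to emit.
 */
static inline uint16_t model_swab16_narrow(uint16_t x)
{
	return (uint16_t)model_rev16(x);
}

/*
 * 32-bit types: no truncation. The result is a valid swab16 only when
 * the caller passes a value whose top halfword is clear.
 */
static inline uint32_t model_swab16_wide(uint32_t x)
{
	return model_rev16(x);
}

/* Both variants agree for well-behaved arguments. */
static inline void model_swab16_selfcheck(void)
{
	assert(model_swab16_narrow(0x1234u) == model_swab16_wide(0x1234u));
}
```

For a well-behaved argument such as `0x1234` both variants return `0x3412`. For `0xAABB1234` the wide variant returns `0xBBAA3412`, which shows why it depends on callers keeping the top halfword clear.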
--
Måns Rullgård
mans at mansr.com
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-17 21:27 ` Måns Rullgård
@ 2010-12-18 17:12 ` Rob Herring
2010-12-18 18:17 ` Måns Rullgård
2010-12-18 21:59 ` Wolfgang Denk
0 siblings, 2 replies; 7+ messages in thread
From: Rob Herring @ 2010-12-18 17:12 UTC
To: u-boot
On 12/17/2010 03:27 PM, Måns Rullgård wrote:
> Rob Herring<robherring2@gmail.com> writes:
>
>> From: Rob Herring<rob.herring@calxeda.com>
>>
>> swab functions are heavily used by FDT code, so enable
>> optimized assembly code for ARMv6 and later.
>>
>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>> ---
>> arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
>> 1 files changed, 16 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
>> index c3489f1..9df5844 100644
>> --- a/arch/arm/include/asm/byteorder.h
>> +++ b/arch/arm/include/asm/byteorder.h
>> @@ -23,6 +23,22 @@
>> # define __SWAB_64_THRU_32__
>> #endif
>>
>> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
>> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
>> +{
>> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
>> +	return x;
>> +}
>
> Pay close attention to what gcc does with this as it is prone to add
> unnecessary masking of the low halfword. If the callers are
> well-behaved (argument having top halfword clear), making the
> parameter and return types here plain unsigned (or u32) gives better
> code.
This is straight from the Linux code, and there are only a few users of
swab16 (none in my build).
Rob
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-18 17:12 ` Rob Herring
@ 2010-12-18 18:17 ` Måns Rullgård
2010-12-18 21:59 ` Wolfgang Denk
1 sibling, 0 replies; 7+ messages in thread
From: Måns Rullgård @ 2010-12-18 18:17 UTC
To: u-boot
Rob Herring <robherring2@gmail.com> writes:
> On 12/17/2010 03:27 PM, Måns Rullgård wrote:
>> Rob Herring<robherring2@gmail.com> writes:
>>
>>> From: Rob Herring<rob.herring@calxeda.com>
>>>
>>> swab functions are heavily used by FDT code, so enable
>>> optimized assembly code for ARMv6 and later.
>>>
>>> Signed-off-by: Rob Herring<rob.herring@calxeda.com>
>>> ---
>>> arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
>>> 1 files changed, 16 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
>>> index c3489f1..9df5844 100644
>>> --- a/arch/arm/include/asm/byteorder.h
>>> +++ b/arch/arm/include/asm/byteorder.h
>>> @@ -23,6 +23,22 @@
>>> # define __SWAB_64_THRU_32__
>>> #endif
>>>
>>> +#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
>>> +static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
>>> +{
>>> +	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
>>> +	return x;
>>> +}
>>
>> Pay close attention to what gcc does with this as it is prone to add
>> unnecessary masking of the low halfword. If the callers are
>> well-behaved (argument having top halfword clear), making the
>> parameter and return types here plain unsigned (or u32) gives better
>> code.
>
> This is straight from the Linux code, and there are only a few users of
> swab16 (none in my build).
Look at the generated code if you don't believe me.
--
Måns Rullgård
mans at mansr.com
* [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions
2010-12-18 17:12 ` Rob Herring
2010-12-18 18:17 ` Måns Rullgård
@ 2010-12-18 21:59 ` Wolfgang Denk
1 sibling, 0 replies; 7+ messages in thread
From: Wolfgang Denk @ 2010-12-18 21:59 UTC
To: u-boot
Dear Rob Herring,
In message <4D0CEB67.2040502@gmail.com> you wrote:
>
> This is straight from the Linux code, and there are only a few users of
> swab16 (none in my build).
Given that we have no idea whether this code really gives any measurable
performance improvement, and that it appears to be dangerous as well,
I tend not to include it as is.
Thanks.
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
It is surely a great calamity for a human being to have no obsessions.
- Robert Bly
Thread overview: 7+ messages
2010-12-15 15:13 [U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions Rob Herring
2010-12-17 20:21 ` Wolfgang Denk
2010-12-17 20:52 ` Rob Herring
2010-12-17 21:27 ` Måns Rullgård
2010-12-18 17:12 ` Rob Herring
2010-12-18 18:17 ` Måns Rullgård
2010-12-18 21:59 ` Wolfgang Denk