* [PATCH] common checksum code uses post-sh2 instruction
From: Rob Landley @ 2015-09-02 13:36 UTC (permalink / raw)
To: linux-sh
From: Rob Landley <rob@landley.net>
The common checksum code uses shld which sh2 hasn't got, so unwind it.
Note: http://nommu.org/jcore added SHAD and SHLD, and if anybody's
building for an actual sh2 they'll either need gcc 4.7 or earlier or
a fix to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089 but
when regression testing against the actual historical hardware,
you need this.
Signed-off-by: Rob Landley <rob@landley.net>
---
arch/sh/lib/checksum.S | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/sh/lib/checksum.S b/arch/sh/lib/checksum.S
index 356c8ec..7f0fc6b 100644
--- a/arch/sh/lib/checksum.S
+++ b/arch/sh/lib/checksum.S
@@ -88,8 +88,9 @@ ENTRY(csum_partial)
2:
! buf is 4 byte aligned (len could be 0)
mov r5, r1
- mov #-5, r0
- shld r0, r1
+ shlr2 r1 ! mov #-5, r0
+ shlr2 r1 ! shld r0, r1
+ shlr r1
tst r1, r1
bt/s 4f ! if it's =0, go to 4f
clrt
@@ -288,8 +289,9 @@ DST( mov.w r0,@r5 )
addc r0,r7
2:
mov r6,r2
- mov #-5,r0
- shld r0,r6
+ shlr2 r6 ! mov #-5, r0
+ shlr2 r6 ! shld r0, r6
+ shlr r6
tst r6,r6
bt/s 2f
clrt
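As a sanity check on the replacement sequence (not part of the patch itself), the following minimal C sketch models both forms of the shift. It assumes that `shld` with a count of -5 performs a logical right shift by 5 bits, and that `shlr2` and `shlr` shift right by 2 bits and 1 bit respectively, so the unrolled sequence totals 2 + 2 + 1 = 5. The function names are hypothetical and exist only for illustration; the value being shifted is presumably the byte length being converted to a count of 32-byte blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "mov #-5, r0; shld r0, rN": assumed to be a logical right shift by 5. */
static uint32_t shift_with_shld(uint32_t len)
{
    return len >> 5;
}

/* Models "shlr2 rN; shlr2 rN; shlr rN": three fixed shifts totalling 5 bits. */
static uint32_t shift_unrolled(uint32_t len)
{
    len >>= 2; /* shlr2 */
    len >>= 2; /* shlr2 */
    len >>= 1; /* shlr  */
    return len;
}

/* Compares both forms over a few boundary values; returns the number checked. */
static size_t check_shift_equivalence(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 31u, 32u, 33u, 1500u, 65535u, 0x80000000u, 0xffffffffu
    };
    const size_t count = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < count; i++)
        assert(shift_with_shld(samples[i]) == shift_unrolled(samples[i]));
    return count;
}
```

Calling `check_shift_equivalence()` from a test harness exercises both paths. Because all three shifts are logical (zero-filling), the equivalence holds for every 32-bit input, including values with the top bit set.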
* Re: [PATCH] common checksum code uses post-sh2 instruction
From: Rich Felker @ 2015-09-04 15:39 UTC (permalink / raw)
To: linux-sh
On Wed, Sep 02, 2015 at 08:36:20AM -0500, Rob Landley wrote:
> From: Rob Landley <rob@landley.net>
>
> The common checksum code uses shld which sh2 hasn't got, so unwind it.
>
> Note: http://nommu.org/jcore added SHAD and SHLD, and if anybody's
> building for an actual sh2 they'll either need gcc 4.7 or earlier or
> a fix to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089 but
> when regression testing against the actual historical hardware,
> you need this.
>
> Signed-off-by: Rob Landley <rob@landley.net>
> ---
>
> arch/sh/lib/checksum.S | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/sh/lib/checksum.S b/arch/sh/lib/checksum.S
> index 356c8ec..7f0fc6b 100644
> --- a/arch/sh/lib/checksum.S
> +++ b/arch/sh/lib/checksum.S
> @@ -88,8 +88,9 @@ ENTRY(csum_partial)
> 2:
> ! buf is 4 byte aligned (len could be 0)
> mov r5, r1
> - mov #-5, r0
> - shld r0, r1
> + shlr2 r1 ! mov #-5, r0
> + shlr2 r1 ! shld r0, r1
> + shlr r1
Is this a performance-critical code path (e.g. for networking)? If
not, I like your current solution best because it's simple and
#ifdef-free. But if it is critical and the shld is faster, perhaps
#ifdef for SH3+ would be appropriate, and we could expand to include
J2 in the #ifdef with the J2 support patches?
Rich