From: Segher Boessenkool <segher@kernel.crashing.org>
To: Gabriel Paubert <paubert@iram.es>
Cc: linuxppc-dev@ozlabs.org, paulus@samba.org,
LKML <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] add strncmp to PowerPC
Date: Mon, 3 Mar 2008 20:08:59 +0100
Message-ID: <ecb965a1e5feba94cd97fb4b9535908e@kernel.crashing.org>
In-Reply-To: <20080303095443.GB27105@iram.es>
>> Even if it was logically faster (which I still doubt) it's a hell of a lot
>> of cache lines to waste.
Yeah, 1 on 64-bit and 3 on 32-bit, that's a terrible lot.</sarcasm>
> Indeed, but there are some corner cases that the C code handles. Like
> a length of 0 which may lead to infinite loop in the asm code.
>
> OTOH, I'm a bit surprised by the extsb instructions in the compiler
> generated code. We don't compile with -fsigned-char, do we? The clrldi
> instructions are also extremely stupid.
Those are both necessary to be equivalent to the C code, which uses
signed char explicitly. It is generally considered a Good Thing(tm)
for the compiler to generate assembler code equivalent to the C code,
even if the C code is wrong.
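To make the signedness point concrete, here is a minimal sketch, assuming 8-bit chars, that takes the difference of one byte pair both ways. The helper names diff_signed and diff_unsigned are made up for this example and do not come from the kernel source; the C standard specifies that strncmp compares bytes as unsigned char.

```c
/* Hypothetical helpers for illustration only, not kernel code. */

/* Difference taken on signed char: bytes >= 0x80 turn negative,
 * so they sort before plain ASCII. */
int diff_signed(const char *s1, const char *s2)
{
	return (signed char)*s1 - (signed char)*s2;
}

/* Difference taken on unsigned char, which is the interpretation
 * the C standard requires for strcmp/strncmp. */
int diff_unsigned(const char *s1, const char *s2)
{
	return (unsigned char)*s1 - (unsigned char)*s2;
}
```

For the bytes 0x80 and 0x7f the two helpers disagree on the sign of the result, which is the kind of mismatch a signed char comparison introduces.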
> Now that I think a bit more about it, I believe that the C version is
> incorrect
It is. It's a great entry for the IOCCC as well.
I just tested the following (can't guarantee it's correct, just a PoC):
int strncmp(const char *s1, const char *s2, unsigned long /*size_t*/ len)
{
	while (len--) {
		unsigned char c1, c2;
		c1 = *s1++;
		c2 = *s2++;
		int cmp = c1 - c2;
		if (cmp)
			return cmp;
		if (c1 == 0 || c2 == 0)
			break;
	}
	return 0;
}
which generates (with GCC-4.2.3)
strncmp:
	addi 5,5,1
	mtctr 5
.L2:
	bdz .L11
	lbz 0,0(3)
	addi 3,3,1
	lbz 9,0(4)
	addi 4,4,1
	cmpwi 7,0,0
	subf. 0,9,0
	cmpwi 6,9,0
	bne- 0,.L4
	beq- 7,.L4
	bne+ 6,.L2
.L4:
	mr 3,0
	blr
.L11:
	li 0,0
	mr 3,0
	blr
which isn't horrid, although it does some weirdish things obviously.
Current GCC-4.4.0 generates
strncmp:
	addi 5,5,1
	mr 10,3
	mtctr 5
	li 11,0
	bdz .L7
	.p2align 4,,15
.L4:
	lbzx 0,10,11
	lbzx 9,4,11
	addi 11,11,1
	subf. 3,9,0
	cmpwi 6,9,0
	cmpwi 7,0,0
	bnelr 0
	beqlr 7
	beqlr 6
	bdnz .L4
.L7:
	li 3,0
	blr
which is about as good as it can get (well, it didn't realise you
only need to test one of c1, c2 for zero. Did I say this was just
proof-of-concept code?)
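The remark about testing only one of c1, c2 could be sketched as follows: once cmp is 0 the two bytes are equal, so checking c1 alone covers both. The name strncmp_poc is invented here to avoid clashing with the libc function. This is only a sketch, and what GCC would generate for it is not shown here.

```c
#include <stddef.h>

/* Hypothetical variant of the proof-of-concept; the name is made up. */
int strncmp_poc(const char *s1, const char *s2, size_t len)
{
	/* len == 0 falls straight through to "return 0". */
	while (len--) {
		/* Read as unsigned char so bytes >= 0x80 order correctly. */
		unsigned char c1 = *s1++;
		unsigned char c2 = *s2++;
		int cmp = c1 - c2;

		if (cmp)
			return cmp;
		/* cmp == 0 means c1 == c2, so one NUL test covers both. */
		if (c1 == 0)
			break;
	}
	return 0;
}
```

A call such as strncmp_poc("abc", "abd", 2) compares only the first two bytes and so reports the strings as equal, while a length of 3 reports "abc" as the smaller one.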
Segher